SQL Server 2008 : Indexing for Performance - Putting It All Together (part 3) - Covering Your Queries

11/30/2010 3:36:07 PM

4. Covering Your Queries

Now that you have your clustered indexes created, think about what will happen if you want to retrieve all of the orders in a given time frame. Let's run the following query. Figure 5 shows the execution plan.

SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh
WHERE soh.OrderDate > '2003-12-31'

Figure 5. The execution plan of a query that retrieves a date range of orders

Looking at the execution plan in Figure 5, you can see that the query optimizer chose a clustered index scan to retrieve the data, which is the same as scanning the entire table. The optimizer decided that scanning the whole table was faster than using any of the indexes defined on it. Of course, the only index that we've defined so far is not useful for the query we've just executed.
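To see the cost of that scan as a number rather than only as a plan shape, you can turn on I/O statistics for the session before running the query. This is a minimal sketch that assumes the same apWriter.SalesOrderHeader table used above. The logical read counts appear in the Messages tab and will depend on your own data.

```sql
-- Report the pages read per table for each statement in this session.
SET STATISTICS IO ON;

-- Same date range query as above; note the "logical reads" value it reports.
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh
WHERE soh.OrderDate > '2003-12-31';

SET STATISTICS IO OFF;
```

Rerunning the same block after each index change gives you a simple way to compare the plans that follow.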

We can improve things by helping the query optimizer out. Execute the code in Listing 2 to create a nonclustered index on the OrderDate column. The listing then reruns the previous query and adds a second query to prove another point. You should see the same execution plans as shown in Figure 6.

Example 2. SQL Script to Create Nonclustered Index and Query the Data for a Given Date Range

-- Nonclustered index keyed on OrderDate only
CREATE NONCLUSTERED INDEX ix_OrderDate ON apWriter.SalesOrderHeader(OrderDate)

-- Query 1: the same date range query as before
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh
WHERE soh.OrderDate > '2003-12-31'

-- Query 2: equality predicate on a single order date
SELECT soh.OrderDate, soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM AdventureWorks2008.apWriter.SalesOrderHeader soh
WHERE soh.OrderDate = '2004-01-01'

Figure 6. The execution plan from the same query on which Figure 5 was based, but this time with a supporting index in place

After running both queries, you should notice that the query optimizer still scans the clustered index for the date range query but uses the nonclustered index for the equality query. Why is that? To help answer that question, let's examine the execution plan of the second query in Listing 2. The query optimizer determines that it is faster to use the nonclustered index to find the specific order date and then do a lookup for the additional information required to fulfill the query results. Remember, the nonclustered index contains only the nonclustered index key and a row locator for the remainder of the data. The queries also return the subtotal, the tax amount, and the total due for an order, none of which are stored in that index. So when a query is looking for a specific indexed value, as the second query is, the query optimizer can quickly find the matching rows using the nonclustered index and then look up the remaining columns.

On the other hand, the first query is looking for a range of data. The query optimizer decided that, given the number of rows that will be retrieved, it's faster to scan the clustered index for the date range since no lookup is required. To show the cost difference between the two indexes, the first query in Listing 3 forces the query optimizer to use the nonclustered index on the order date. Figure 7 shows the resulting execution plan.
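Before forcing the index, it can help to check how many rows each predicate actually matches, since that row count drives the optimizer's choice between a scan and a seek with lookups. The query below is a rough sketch against the same table. The column aliases are illustrative names, not part of the original listings.

```sql
-- Compare how many rows each predicate matches against the table total.
SELECT COUNT(*) AS TotalRows,
       SUM(CASE WHEN soh.OrderDate > '2003-12-31' THEN 1 ELSE 0 END) AS RangeRows,
       SUM(CASE WHEN soh.OrderDate = '2004-01-01' THEN 1 ELSE 0 END) AS EqualityRows
FROM apWriter.SalesOrderHeader soh;
```

If RangeRows is a large share of TotalRows while EqualityRows is small, that is consistent with the plans described above: each matching row costs one lookup, so many matches make a full scan the cheaper option.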

Example 3. SQL Script That Forces SQL Server to Use the Nonclustered Index for the Date Range Scan

-- Query 1: force the nonclustered index with a table hint
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh WITH (INDEX(ix_OrderDate))
WHERE soh.OrderDate > '2003-12-31'

-- Query 2: same query with no hint, so the optimizer chooses the plan
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh
WHERE soh.OrderDate > '2003-12-31'

Figure 7. The difference between using the nonclustered index versus the clustered index for querying the OrderDate range

As you can see, the plan that uses the nonclustered index includes bookmark lookups and is more costly than simply scanning the clustered index. One thing you can do to improve the performance of the nonclustered index is to cover the query completely, so that every column the query needs is available in the index itself. Listing 4 shows an example of a nonclustered index created on OrderDate that includes the subtotal, the tax amount, and the total due columns.

To show the cost difference between the indexes, the code forces the use of each index that we've created. That way, you can see the difference between using the clustered index, the nonclustered index on order date, and the nonclustered index on order date with the included columns. Figure 8 shows the resulting execution plans. You can see that the nonclustered index with included columns, which covers the query, provides the most efficient method for accessing the data.

Example 4. SQL Script That Creates a Nonclustered Index and Uses the Different Indexes That Exist on the Table

-- Covering index: OrderDate is the key; the other columns are stored at the leaf level
CREATE NONCLUSTERED INDEX ix_OrderDatewInclude
ON apWriter.SalesOrderHeader(OrderDate) INCLUDE (SubTotal, TaxAmt, TotalDue)

-- Query 1: force the nonclustered index on OrderDate (requires lookups)
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh WITH (INDEX(ix_OrderDate))
WHERE soh.OrderDate > '2003-12-31'

-- Query 2: force the covering index with the included columns
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh WITH (INDEX(ix_OrderDatewInclude))
WHERE soh.OrderDate > '2003-12-31'

-- Query 3: force ix_SalesOrderId (the clustered index, per the text above)
SELECT soh.SalesOrderID, soh.SubTotal, soh.TaxAmt, soh.TotalDue
FROM apWriter.SalesOrderHeader soh WITH (INDEX(ix_SalesOrderId))
WHERE soh.OrderDate > '2003-12-31'
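
To confirm which columns each index actually carries, you can query the catalog views. The following is a sketch that assumes the indexes created in this section exist on apWriter.SalesOrderHeader. The column aliases are illustrative.

```sql
-- List the key and included columns of every index on the table.
SELECT i.name                AS IndexName,
       i.type_desc           AS IndexType,
       c.name                AS ColumnName,
       ic.key_ordinal        AS KeyPosition,   -- 0 for included (non-key) columns
       ic.is_included_column AS IsIncluded
FROM sys.indexes i
JOIN sys.index_columns ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('apWriter.SalesOrderHeader')
ORDER BY i.index_id, ic.is_included_column, ic.key_ordinal;
```

SalesOrderID will not be listed under ix_OrderDatewInclude, yet the query is still covered. On a table with a clustered index, every nonclustered index stores the clustered key as its row locator, so assuming ix_SalesOrderId is clustered on SalesOrderID, that column is available without a lookup.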